A 9-billion-parameter large language model based on the DeepSeek-V3 architecture, trained from scratch on a fully open-source, English-only dataset of more than 350 billion tokens, and intended to support open-source community development and debugging.
Large Language Model
Transformers, English